Abstract

Random instances of Constraint Satisfaction Problems (CSP's) appear to be hard for all known algorithms
when the number of constraints per variable lies in a certain interval. Contributing to the general understanding
of the structure of the solution space of a CSP in the satisfiable regime, we formulate a set of natural technical
conditions on a large family of (random) CSP's, and prove bounds on the three most interesting thresholds for the
density of such an ensemble: namely, the satisfiability threshold, the threshold for clustering of the solution space,
and the threshold for an appropriate reconstruction problem on the CSP's. The bounds become asymptotically
tight as the number of degrees of freedom in each clause diverges. The families are general enough to include
commonly studied problems such as random instances of Not-All-Equal-SAT, k-XOR formulae, hypergraph
2-coloring, and graph k-coloring. An important new ingredient is a condition involving the Fourier expansion of
clauses, which characterizes the class of problems with a similar threshold structure.
1 Introduction



Given a set of n variables taking values in a finite alphabet, and a collection of m constraints, each restricting a
subset of variables, a Constraint Satisfaction Problem (CSP) requires finding an assignment to the variables that
satisfies the given constraints. Important examples include k-SAT, Not-All-Equal SAT, and graph (vertex) coloring
with k colors. Understanding the threshold of satisfiability/unsatisfiability for random instances of CSPs, as the
number of constraints m = m(n) varies, has been a challenging task for the past couple of decades, with some notable
successes (see e.g. [ANP05]). On the algorithmic side, the challenge of finding solutions of a random CSP close to
the threshold of satisfiability (in the regime where solutions are known to exist) remains widely open. All provably
polynomial-time algorithms fail well before the SAT to UNSAT threshold.

The attempt to understand this universal failure led to studying the geometry of the set of solutions of random
CSPs [MPZ02, AC08], as well as the emergence of long range correlations among variables in random satisfying
assignments [KM+07]. These research directions are motivated by two heuristic explanations of the failure of
polynomial algorithms: (1) The space of solutions becomes increasingly complicated as the number of constraints increases
and is not captured correctly by simple algorithms; (2) Typical solutions become increasingly correlated and local
algorithms cannot unveil such correlations.

By analyzing a large class of random CSP ensembles, this paper provides strong support to the belief that the 
above phenomena are generic, that they are characterized by sharp thresholds, and that the thresholds for clustering 
and reconstruction do coincide. 

1.1 Related work 

Building on a fascinating conjecture on the geometry of the set of solutions, statistical physicists have developed
surprisingly efficient message passing algorithms to solve random CSPs. For instance, survey propagation [MPZ02,
MZ02] has been shown empirically to find solutions of random 3-SAT extremely close to the SAT-UNSAT transition.
In order to understand the success of these heuristics, it has become important to study the thresholds for the
emergence of so-called clustering of solutions: the emergence of an exponential number of sets (or clusters) of
solutions, where solutions within a cluster are closer to each other (in the Hamming sense, say) than solutions in
different clusters [MMZ05, AR06, AC08]. Moreover, the fact that solutions within a cluster impose long-range correlations
among assignments of variables motivates one to study the so-called reconstruction problem in the context of random
CSP's. Indeed, non-rigorous statistical mechanics calculations imply that the clustering and reconstruction thresholds
coincide [MM06, KM+07].

Finally, understanding the threshold for (non)reconstruction is also becoming relevant (if not crucial) to
understanding the limit of the Glauber dynamics to sample from the set of solutions of a CSP. Indeed, non-reconstructibility
was proved in [BK+05] to be a necessary condition for fast mixing, and is expected to be sufficient for a large class
of 'sufficiently random' problems [GM07].

In a recent paper, Gerschenfeld and the first author [GM07] considered the reconstruction problem for graphical
models, which included the case of proper colorings of the vertices of a random graph. This amounts to understanding
the correlation (as measured e.g. through mutual information) between the color of a vertex v, and the colors of
vertices at distance > t from v. In particular, the problem is said to be 'unsolvable' if such a correlation decays to
0 with t. We refer to Section 3 for a precise definition of the reconstruction problem. For a class of models, including
the so-called Ising spin glass, the antiferromagnetic Potts model, and proper q-colorings of a graph, [GM07] derived
a general sufficient condition, under which reconstruction for (sparse) random graphs G(n, m) with m = cn edges is
possible if and only if it is possible for a Galton-Watson tree with independent Poisson(2c) degrees for each vertex.
Moreover, they also verified that the condition holds for the Ising spin glass and the antiferromagnetic Potts model at
non-zero temperature, leaving open the case of proper colorings of graphs.

1.2 Summary of contributions 

It is against this backdrop that we consider certain general families of CSP's, the first dealing with constraints
consisting of k-tuples of binary variables (as in k-uniform hypergraph 2-coloring or Not-All-Equal (NAE) k-SAT),
and the second dealing with q-colorings of vertices of graphs (which may be seen as an instance of a CSP with q-ary
variables), and study three important threshold phenomena. Our chief contributions are as follows.
(a) We formulate a fairly natural set of assumptions under which a general class of constraint satisfaction problems
(including the models mentioned above) can be understood rather precisely in terms of the thresholds for satisfiability,
clustering and (non)reconstruction phenomena. In particular, we verify that the last two thresholds coincide within
the precision of our bounds.

(b) We consider tree ensembles (families of random CSP's whose variable-constraint dependency structure takes
the form of a tree), and prove optimal bounds on the threshold for reconstruction on trees. These CSP's consist of
binary variables with k-ary constraints, and the bounds are optimal to first order as k goes to infinity.

(c) We verify the sufficient condition of [GM07] for proper colorings of graphs, thus extending the reconstruction
result for colorings on trees to the same on (sparse) random graphs.

(d) By way of techniques, we make crucial use of the Fourier expansion of the (binary k-CSP) constraints, after
introducing an assumption on the Fourier expansion as part of the random ensemble under consideration; this is
key to being able to characterize the thresholds precisely.

(e) Finally, as illustrative examples, we mention the specific bounds (on various thresholds) that follow for some
standard models, such as NAE k-SAT, k-XOR formulae, etc.

The organization of the paper is as follows. In Section 2 we give the formal definitions and assumptions of
our models. We state our main results in Section 3. In Section 4 we state and prove the optimal bounds for the
tree reconstruction problem. In Section 5 we verify the sufficient condition (from [GM07]) for the specific problem
of graph proper q-coloring, thus proving one of our main results: optimal bounds on the (sparse) random graph
reconstruction problem for colorings. In the Appendix we derive a certain technical second moment bound that is
needed for our work.

2 Definitions 

In this section we define a family of random CSP ensembles: problems with constraints involving k-tuples of binary
variables, and q-ary ensembles as a natural extension. We also introduce some analytic definitions that we will need
in order to present our results.

Binary k-CSP ensemble. Given an integer $n$, $\alpha \in \mathbb{R}_+$, and a distribution $p = \{p(\varphi)\}$ over Boolean functions
$\varphi : \{+1,-1\}^k \to \{0,1\}$, CSP$(n,\alpha,p)$ is the ensemble of random CSP's over $n$ Boolean variables $x = (x_1, \dots, x_n)$
defined as follows. For each $a \in \{1, \dots, m = n\alpha\}$, draw $k$ indices $i_a(1), \dots, i_a(k)$ independently and uniformly
at random in $[n]$, and a function $\varphi_a$ with distribution $p(\varphi)$. An assignment $x$ satisfies the resulting instance if
$\varphi_a(x_{i_a(1)}, \dots, x_{i_a(k)}) = 1$ for each $a \in [m]$. A CSP instance can be naturally described by a bipartite graph $G$ (often
referred to in the literature as a 'factor graph') including a node for each clause $a \in [m]$ and for each variable $i \in [n]$,
and an edge $(i,a)$ whenever variable $x_i$ appears in the $a$-th clause.
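The sampling procedure just described can be sketched in a few lines of Python. This is only an illustrative sketch: the names `sample_csp`, `satisfies` and `nae` are hypothetical (not from the paper), the number of clauses is taken as m = round(alpha * n), and the clause distribution p is represented by a callable `draw_clause` that returns a Boolean function on k arguments.

```python
import random

def nae(values):
    """Not-All-Equal clause: returns 1 unless all arguments coincide."""
    return int(len(set(values)) > 1)

def sample_csp(n, alpha, k, draw_clause, rng):
    """Sample m = round(alpha * n) clauses; each clause is (indices, function)."""
    m = int(round(alpha * n))
    instance = []
    for _ in range(m):
        # k indices drawn independently and uniformly (with replacement) in [n]
        indices = tuple(rng.randrange(n) for _ in range(k))
        instance.append((indices, draw_clause(rng)))
    return instance

def satisfies(x, instance):
    """x is a list of +1/-1 values; every clause must evaluate to 1."""
    return all(phi([x[i] for i in indices]) == 1 for indices, phi in instance)

rng = random.Random(0)
# Single-clause distribution p: every clause is the NAE function (assumed example).
instance = sample_csp(n=20, alpha=1.5, k=3, draw_clause=lambda r: nae, rng=rng)
```

The pairs (i, a) with i in `indices` of the a-th clause are exactly the edges of the factor graph G.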

q-ary ensembles. A q-ary ensemble is the natural generalization of a binary ensemble to the case in which
variables take q values. For the sake of simplicity, we restrict our discussion here to the case of pairwise constraints
(i.e. k = 2 in the language of the previous paragraph).

Given an integer $n$, $\alpha \in \mathbb{R}_+$, and a distribution $p = \{p(\varphi)\}$ over Boolean functions $\varphi : [q] \times [q] \to \{0,1\}$,
CSP$_q(n,\alpha,p)$ is the collection of random CSP's over $q$-ary variables $x_i$, for $i = 1, 2, \dots, n$, defined as follows. For
each $a \in \{1, \dots, m = n\alpha\}$, draw 2 indices $i_a, j_a$ independently and uniformly at random in $[n]$, and a function $\varphi_a$
with distribution $p(\varphi)$. An assignment $x = (x_1, \dots, x_n)$ satisfies the resulting instance if $\varphi_a(x_{i_a}, x_{j_a}) = 1$ for each
$a \in [m]$.

In this paper, by way of illustrating how the results for binary ensembles could be (purportedly) extended to
q-ary ensembles, we will exclusively study the q-coloring model, which consists of ensembles with the single clause
$\varphi(x,y) = \mathbb{I}(x \neq y)$. This model corresponds to proper colorings with q colors of a random sparse graph with an
edge-to-vertex density of $\alpha > 0$.

In the rest of this section, we briefly review some well known definitions in discrete Fourier analysis that are 
useful for stating our results. 
Functional analysis of clauses. We denote by $\nu_\theta$ the measure defined over $\{-1,+1\}^k$ such that

$$\nu_\theta(x) = \prod_{i=1}^{k} \frac{1+\theta x_i}{2}$$

for every $x \in \{-1,+1\}^k$. This is just the measure induced by choosing $k$ independent copies of a
random variable that takes values $\pm 1$ and has expectation $\theta$. Notice that when $\theta = 0$, $\nu_\theta$ corresponds to the uniform
measure over $\{-1,+1\}^k$.

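The inner product induced by this measure on the space of real functions defined on $\{-1,+1\}^k$ is denoted by
$\langle\cdot,\cdot\rangle_\theta$, and the corresponding norm by $\|\cdot\|_\theta$. If $\theta = 0$, we drop the subindex and just use $\langle\cdot,\cdot\rangle$ and $\|\cdot\|$, respectively.
Thus, if $f, g : \{-1,+1\}^k \to \mathbb{R}$, then

$$\langle f,g\rangle_\theta = \sum_{x\in\{-1,+1\}^k} f(x)g(x)\nu_\theta(x), \qquad \|f\|_\theta^2 = \sum_{x\in\{-1,+1\}^k} f^2(x)\nu_\theta(x),$$

$$\langle f,g\rangle = 2^{-k}\sum_{x\in\{-1,+1\}^k} f(x)g(x), \qquad \|f\|^2 = 2^{-k}\sum_{x\in\{-1,+1\}^k} f^2(x).$$

We denote the Hilbert space of functions $\{-1,+1\}^k \to \mathbb{R}$ under the inner product $\langle\cdot,\cdot\rangle$ by $\mathcal{F}_k$.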

Fourier transform of clauses. For any $Q \subseteq [k] \stackrel{\rm def}{=} \{1, \dots, k\}$, let $\gamma_Q(x) = \prod_{i\in Q} x_i$. Under the scalar product
defined above (with $\theta = 0$), the functions $\{\gamma_Q\}_{Q\subseteq[k]}$ form an orthonormal basis for $\mathcal{F}_k$. Moreover, they are exactly the
algebraic characters of $\{-1,+1\}^k$ with the group operation of pointwise multiplication. Thus, we define the Fourier
transform of a function $f \in \mathcal{F}_k$ by letting, for any $Q \subseteq [k]$,

$$f_Q \stackrel{\rm def}{=} \langle \gamma_Q, f\rangle = 2^{-k}\sum_{x\in\{-1,+1\}^k} f(x)\gamma_Q(x).$$



Noise operator. Given $\theta \in [-1,1]$, we define the Bonami-Beckner operator $T_\theta : \mathcal{F}_k \to \mathcal{F}_k$ by letting
$(T_\theta f)(x)$ be the expected value of $f(x_\theta)$, where $x_\theta$ is obtained from $x$ by flipping each
coordinate independently with probability $(1-\theta)/2$. Notice that $T_1$ is just the identity operator and $T_0$ sends $f$ to
the constant function $\langle f, \gamma_\emptyset\rangle$.

The Bonami-Beckner operator diagonalizes with respect to the Fourier basis, in the sense that $(T_\theta \gamma_Q)(x) =
\theta^{|Q|}\gamma_Q(x)$ for any $Q \subseteq [k]$.

More generally, given $h \in [-1,1]^k$, we define $(T_h f)(x) = E[f(x_h)]$, where $x_h$ is obtained from $x$ by flipping the
$i$-th coordinate independently and with probability $(1-h_i)/2$. Since $T_h$ also diagonalizes with respect to the Fourier basis,
one gets $(T_h \gamma_S)(x) = \gamma_S(h)\gamma_S(x)$.
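As a sanity check of the definitions above, the following brute-force Python sketch computes Fourier coefficients and applies the noise operator by summing over the whole hypercube. The function names (`cube`, `gamma`, `fourier`, `noise`) are illustrative choices, not notation from the paper, and the approach is only feasible for small k.

```python
import itertools
import math

def cube(k):
    """All points of {-1,+1}^k."""
    return list(itertools.product((-1, 1), repeat=k))

def gamma(Q, x):
    """Character gamma_Q(x) = prod_{i in Q} x_i (Q is a tuple of 0-based indices)."""
    return math.prod(x[i] for i in Q)

def fourier(f, k, Q):
    """Fourier coefficient f_Q = 2^{-k} sum_x f(x) gamma_Q(x)."""
    return sum(f(x) * gamma(Q, x) for x in cube(k)) / 2 ** k

def noise(f, k, theta):
    """Return T_theta f, where each coordinate is flipped w.p. (1 - theta)/2."""
    def t_f(x):
        total = 0.0
        for y in cube(k):
            # Probability of reaching y from x under independent flips.
            weight = math.prod((1 + theta * xi * yi) / 2 for xi, yi in zip(x, y))
            total += weight * f(y)
        return total
    return t_f

def nae(x):
    """Not-All-Equal clause, used here only as a test function."""
    return int(len(set(x)) > 1)
```

For instance, applying `noise` to the character with Q = (0, 2) and theta = 0.3 rescales it by theta squared at every point, in agreement with the diagonalization property stated above.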

Discrete derivative and influence. Given a function $f \in \mathcal{F}_k$, we define its discrete derivative $f^{(1)} \in \mathcal{F}_{k-1}$ as
$f^{(1)}(x) = \frac{1}{2}[f(1,x) - f(-1,x)]$. We define $f^{(i)}$ analogously for any other variable index. Finally, the influence of
the $i$-th variable on $f$ is defined using the norm of the derivative:

$$I_i(f) \stackrel{\rm def}{=} \|f^{(i)}\|^2.$$

For any $Q \subseteq [k]\setminus\{i\}$, $f^{(i)}_Q = f_{Q\cup\{i\}}$.

3 Main results 

3.1 Binary k-CSP ensembles

We assume the following conditions on the ensemble. 



1. Permutation symmetry. If $\varphi^\pi$ is the Boolean function obtained from $\varphi$ by permuting its arguments, we require
$p(\varphi^\pi) = p(\varphi)$.

2. Balance. The distribution $p$ is supported on Boolean functions such that $\varphi(x_1, \dots, x_k) = \varphi(-x_1, \dots, -x_k)$.
This condition implies that the odd Fourier coefficients of $\varphi$ are zero.

3. Feasibility. For each Boolean function $\varphi$ in the support of $p$, every partial assignment $(x_1, \dots, x_{k-1})$ can be
extended to a satisfying assignment $(x_0, x_1, \dots, x_{k-1})$ of $\varphi$. This condition implies that $\|\varphi\|^2 \ge 1/2$ and, together
with the balance condition, implies that all the variables of $\varphi$ have the same influence, namely $I_i(\varphi) = \frac{1-\|\varphi\|^2}{2}$.

4. Dominance of balanced assignments. For every $\theta \in [-1,1]$,

$$E_\varphi \log\|\varphi\|_\theta \le E_\varphi \log\|\varphi\|.$$

This condition implies that, in a typical random instance, most solutions are balanced in the sense that they have
almost as many $+1$'s as $-1$'s.

While our ultimate goal is to exhibit results as $k \to \infty$, the probability distribution $p$ over the functions
$\varphi : \{-1,1\}^k \to \{0,1\}$ must be defined for every $k$, and some agreement should exist between such probability
distributions for different $k$'s. In our work this agreement is given by two conditions concerning the derivative of the
clauses in the support of $p$:



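(a) The $\ell_1$ norm of the Fourier transform grows at most polynomially in $k$. That is, for every $\varphi \in \mathrm{supp}(p)$,

[Eq. (1): display illegible in the source]

for some constant $a$ not depending on $k$.

(b) 'Small weight' Fourier coefficients are small. There is a constant $C > 0$ (not depending on $k$) such that for
every $\varphi \in \mathrm{supp}(p)$,

[Eq. (2): display illegible in the source; a bound involving $T_\theta \varphi^{(1)}$, for all $\theta \in [0,1]$]

The above implies in particular that, for any fixed $\ell$, there exists $A_\ell > 0$ (independent of $k$), such that

[Eq. (3): display illegible in the source; it involves the sums over $1 \le |Q| \le \ell$ and over $|Q| > \ell$]

An equivalent formulation of Eq. (2) (with a possibly different constant $C$) is

[Eq. (4): display illegible in the source; for all $\theta \in [0,1]$]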


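Results. An ensemble of binary k-CSP's will be characterized by the following quantities:

$$\frac{1}{\Omega_k} \stackrel{\rm def}{=} E_\varphi\, \frac{2 I_1(\varphi)}{\|\varphi\|^2}, \qquad \frac{1}{\hat\Omega_k} \stackrel{\rm def}{=} -E_\varphi \log\left(\|\varphi\|^2\right).$$

Notice that $\Omega_k \le \hat\Omega_k$, and $\Omega_k \approx \hat\Omega_k$ whenever the influence is relatively small or, equivalently, when the norm is close
to 1.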

Proposition 3.1 A random binary constraint satisfaction instance from the CSP$(n,\alpha,p)$ ensemble is satisfiable,
with high probability, if $\alpha < \alpha_s(k)$, where

$$\Omega_k \log 2\, \{1 + o(1)\} \le \alpha_s(k) \le \hat\Omega_k \log 2\, \{1 + o(1)\}.$$

Vice versa, if $\alpha > \alpha_s(k)(1 + o(1))$, then with high probability a CSP$(n,\alpha,p)$ instance is unsatisfiable.
Given an instance of CSP$(n,\alpha,p)$, a cluster of solutions is any equivalence class of solutions under the (closure
of the) relation $x \sim x'$ if $d_{\rm Hamming}(x, x') \le d_{\max}$, for some $d_{\max} = o(n)$. The set of solutions is clustered if it is
partitioned into exponentially many clusters.
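For a small instance whose solutions can be enumerated, the clusters defined above are the connected components of the graph that links two solutions at Hamming distance at most d_max. A minimal union-find sketch follows; the function names and the explicit list-of-solutions input are illustrative assumptions, and the quadratic pairwise comparison is only meant for toy examples.

```python
def hamming(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def clusters(solutions, d_max):
    """Equivalence classes of the transitive closure of 'distance <= d_max'."""
    parent = list(range(len(solutions)))

    def find(i):
        # Path-halving union-find lookup.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(solutions)):
        for j in range(i + 1, len(solutions)):
            if hamming(solutions[i], solutions[j]) <= d_max:
                parent[find(i)] = find(j)

    groups = {}
    for i, s in enumerate(solutions):
        groups.setdefault(find(i), []).append(s)
    return list(groups.values())
```

Taking the transitive closure is what makes the relation an equivalence: two solutions far apart may still share a cluster if a chain of nearby solutions connects them.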

Theorem 3.2 The set of solutions of an instance from the CSP$(n,\alpha,p)$ ensemble is clustered, with high probability,
if $\alpha > \alpha_d(k)$, where

$$\alpha_d(k) = \frac{\Omega_k}{k}\,\{\log k + o(\log k)\}.$$

Given a measure $\mu(x)$ over variable assignments in $\{+1,-1\}^n$, the reconstruction problem is said to be unsolvable
if correlations with respect to $\mu$ decay rapidly with the distance $r$ on $G$. More precisely, if $\mu_{i,\sim r}$ denotes the joint
distribution of $x_i$ and $\{x_j : d_G(i,j) > r\}$, then $\lim_{r\to\infty}\limsup_{n\to\infty} E\|\mu_{i,\sim r} - \mu_i\mu_{\sim r}\|_{TV} = 0$.

Theorem 3.3 Let $\mu(x)$ be the uniform measure over solutions of an instance from the CSP$(n,\alpha,p)$ ensemble. The
reconstruction problem is solvable for $\mu$ if $\alpha > \alpha_r(k)$, where

$$\alpha_r(k) = \frac{\Omega_k}{k}\,\{\log k + o(\log k)\}.$$

Vice versa, the reconstruction problem is unsolvable if $\alpha < \alpha_r(k)$.

Thus, a key result of the present paper is that $\alpha_d(k)$ and $\alpha_r(k)$ do coincide for a large family of ensembles (up
to the slackness, in the second order terms, of our bounds).

Example: 2-coloring hypergraphs. Let us consider the ensemble of CSP's consisting of clauses of the type
$\varphi(x_1, \dots, x_k) = \mathbb{I}\left(\sum_i x_i \notin \{-k,k\}\right)$. The CSP$(n,\alpha,p)$ in this case corresponds to the distribution of
2-colorings of a random hypergraph on $n$ vertices and $\alpha n$ edges, with edge size $k$, and each edge chosen independently
and uniformly at random.

The conditions 1-3 clearly hold for this model, and the dominance of balanced assignments follows after checking that
$\|\varphi\|_\theta^2 = 1 - \left(\frac{1+\theta}{2}\right)^k - \left(\frac{1-\theta}{2}\right)^k$ is maximized at $\theta^* = 0$. To establish the conditions (1) and (2), notice that $\varphi^{(1)}_Q = -2^{-k}\left[1 - (-1)^{|Q|}\right]$,
which clearly implies that the $\ell_1$ norm of the Fourier transform is bounded. To check (2), notice that
[display illegible in the source; an exponential bound valid for all $\theta \in [0,1]$].

An easy computation shows that $\Omega_k = 2^{k-1} - 1$ and $1/\hat\Omega_k = -\log(1 - 2^{-k+1})$, therefore we have:





              Reconstruction - Clustering                  Lower bound satisfiability      Upper bound satisfiability
2-coloring    $\frac{2^{k-1}}{k}[\log k + o(\log k)]$      $2^{k-1}\log 2\,[1 + o(1)]$     $2^{k-1}\log 2\,[1 + o(1)]$
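The values of the two ensemble constants for this example can be checked by brute force. The sketch below assumes the definitions $1/\Omega_k = E_\varphi[2 I_1(\varphi)/\|\varphi\|^2]$ and $1/\hat\Omega_k = -E_\varphi\log\|\varphi\|^2$, which are a reading of the garbled source consistent with the example values quoted in the text; it handles an ensemble with a single clause, and all function names are hypothetical.

```python
import itertools
import math

def nae(x):
    """2-coloring / Not-All-Equal clause on a tuple of +1/-1 values."""
    return int(len(set(x)) > 1)

def norm_sq(phi, k):
    """||phi||^2 under the uniform measure on {-1,+1}^k."""
    return sum(phi(x) ** 2 for x in itertools.product((-1, 1), repeat=k)) / 2 ** k

def influence_first(phi, k):
    """I_1(phi) = ||phi^(1)||^2, with phi^(1)(x) = [phi(1,x) - phi(-1,x)] / 2."""
    total = 0.0
    for x in itertools.product((-1, 1), repeat=k - 1):
        d = 0.5 * (phi((1,) + x) - phi((-1,) + x))
        total += d * d
    return total / 2 ** (k - 1)

def omegas(phi, k):
    """(Omega_k, Omega_hat_k) for a single-clause ensemble (assumed definitions)."""
    n2 = norm_sq(phi, k)
    omega = n2 / (2 * influence_first(phi, k))
    omega_hat = -1.0 / math.log(n2)
    return omega, omega_hat

omega, omega_hat = omegas(nae, 5)
```

For k = 5 this gives Omega_k = 2^4 - 1 and a slightly larger Omega_hat_k, so the lower and upper satisfiability bounds of Proposition 3.1 differ only in lower-order terms for this model.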



Example: Not-All-Equal k-SAT. Let us consider now an ensemble of CSP's consisting of clauses of type
$\{\varphi_s\}_{s\in\{+1,-1\}^k}$, where $\varphi_s(x_1, \dots, x_k) = \mathbb{I}\left(\sum_i x_i s_i \notin \{-k,k\}\right)$ and $p(\varphi_s) = 2^{-k}$ for each $s \in \{+1,-1\}^k$. In this case,
the CSP$(n,\alpha,p)$ model corresponds to the distribution of NAE k-SAT instances for a random formula in $n$ variables,
consisting of $\alpha n$ random clauses, each with $k$ literals.

For this model, the conditions 1-3 are easily verified. The dominance of balanced assignments follows from

$$E_s \log\|\varphi_s\|_\theta^2 \le \log E_s\|\varphi_s\|_\theta^2 = \log E_s\left(1 - \prod_{i=1}^{k}\frac{1+\theta s_i}{2} - \prod_{i=1}^{k}\frac{1-\theta s_i}{2}\right) = \log\|\varphi\|^2.$$

On the other hand, the Fourier expansion of $\varphi_s$ is given by $\varphi_{s,Q} = -2^{-k}[\gamma_Q(s) + \gamma_Q(-s)]$ for $Q \neq \emptyset$. In particular $|\varphi_{s,Q}| =
2^{-k}\left[1 + (-1)^{|Q|}\right]$, so that both Eqs. (1) and (2) hold along the same lines as the previous example. Indeed, in this
case we get the same values for $\Omega_k$ and $\hat\Omega_k$, so that we have:





              Reconstruction - Clustering                  Lower bound satisfiability      Upper bound satisfiability
NAE-SAT       $\frac{2^{k-1}}{k}[\log k + o(\log k)]$      $2^{k-1}\log 2\,[1 + o(1)]$     $2^{k-1}\log 2\,[1 + o(1)]$
Example: k-XOR formulas. For an even integer $k$, the k-XOR ensemble consists of clauses of type
$\{\varphi_\varepsilon\}_{\varepsilon\in\{+1,-1\}}$, where $\varphi_\varepsilon(x_1, \dots, x_k) = \frac{1}{2}\left(\gamma_\emptyset + \varepsilon\gamma_{[k]}\right)$. In this case, the CSP$(n,\alpha,p)$ model corresponds to a system of
$\alpha n$ random linear equations in $\mathbb{Z}_2$, in which every equation involves $k$ randomly chosen variables (with replacement)
from a total of $n$ possible variables.

Conditions 1-3 hold for $k$ even, and the dominance of balanced assignments condition follows from the fact that
$E_\varphi\log\|\varphi\|_\theta^2 = \frac{1}{2}\log\left(\frac{1-\theta^{2k}}{4}\right)$, which is clearly maximized at $\theta^* = 0$. The condition on the Fourier expansion of clauses for
this model is straightforward: the Fourier expansion of $\varphi_\varepsilon$ is concentrated at $\emptyset$ and $[k]$, so that Eq. (1) holds
with $a = 0$ and Eq. (2) holds with $C = 1$.

In this case, we have that $\Omega_k = 1$, while $\hat\Omega_k = 1/\log 2$. Therefore, we have:





              Reconstruction - Clustering               Lower bound satisfiability      Upper bound satisfiability
XOR-SAT       $\frac{1}{k}[\log k + o(\log k)]$         $\log 2 + o(1)$                 $1 + o(1)$



We remark here that, in the case of XOR-SAT, the clustering and satisfiability thresholds can be determined
exactly by exploiting the underlying group structure [MRZ03, CD+03] (see [MM09] for a discussion of the
reconstruction problem in XOR-SAT).

3.2 q-ary ensembles: graph coloring

The following results concerning the colorability and clustering of proper colorings were proved by Achlioptas and
Naor [AN05] and by Achlioptas and Coja-Oghlan [AC08].

Theorem 3.4 (Graph q-colorability [AN05]) A random graph with $n$ vertices and $n\alpha$ edges is q-colorable with high
probability if $\alpha < \alpha_s(q)$, where

$$\alpha_s(q) = q\,[\log q + o_q(1)].$$

Vice versa, if $\alpha > \alpha_s(q)(1 + o_q(1))$, such a graph is with high probability uncolorable.

Theorem 3.5 (Clustering of q-colorings [AC08]) The set of proper q-colorings of a random graph with $n$ vertices and
$n\alpha$ edges is clustered with high probability if $\alpha > \alpha_d(q)$, where

$$\alpha_d(q) = \frac{q}{2}\,[\log q + o(\log q)].$$

One of our main results is to prove a corresponding reconstruction theorem for this model as follows. 

Theorem 3.6 (Graph q-coloring reconstruction) Let $\mu(x)$ be the uniform measure over proper q-colorings of a
random graph with $n$ vertices and $n\alpha$ edges. For $q$ large enough, the reconstruction problem is solvable for $\mu$ if
$\alpha > \alpha_r(q)$, where

$$\alpha_r(q) = \frac{q}{2}\,[\log q + \log\log q + O(1)].$$

Vice versa, the reconstruction problem is unsolvable, with high probability, if $\alpha < \alpha_r(q)$.

3.3 General strategy 

The results described in the previous section are of three types: bounds on the satisfiability thresholds, cf. Proposition
3.1 and Theorem 3.4; on the clustering threshold, cf. Theorems 3.2 and 3.5; and on the reconstruction threshold, cf.
Theorems 3.3 and 3.6. The proof strategy is as follows.

The satisfiability threshold can be upper bounded using the first moment of the number of solutions, and lower
bounded using the second moment method. This technique is by now discussed in detail in [AM02, AN05, ANP05];
its application to the general CSP$(n,\alpha,p)$ ensemble is described in the Appendix.

The clustering threshold can be upper bounded through an analysis of the recursive 'whitening' process that associates
to each cluster a single configuration in an extended space [AR06]. The improved bounds in Theorems 3.2 and 3.5
can be obtained by approximating the CSP ensemble with an appropriate 'planted' ensemble [AC08]. Since this
approach is explained in detail in [AC08], we will only present the various technical steps.

The reconstruction threshold is characterized via a three-step procedure: 
(1) Bound the reconstruction threshold for an appropriate ensemble of (infinite) tree instances, i.e. CSP instances
for which the associated factor graph is an infinite Galton-Watson tree. In the case of proper q-colorings, a sharp
characterization was obtained independently by two groups in the past year [BVV07, Sly08]. In Section 4 we prove
sharp bounds on tree reconstruction for binary CSPs. The proof amounts to deriving an exact distributional recursion
for the so-called belief process, and carefully bounding its asymptotic behavior.

(2) Given two 'balanced' solutions $x^{(1)}, x^{(2)}$ (a solution is balanced if each possible variable value is taken on the
same number of vertices), define their joint type $\nu(x,y)$ as the matrix such that the fraction of vertices $i$ with $x_i^{(1)} = x$
and $x_i^{(2)} = y$ is equal to $\nu(x,y)$. Consider the number $Z_b(\nu)$ of balanced solution pairs $x^{(1)}, x^{(2)}$ with joint type $\nu$. One
has to show that $E Z_b(\nu)$ is exponentially dominated by its value at the uniform type $\bar\nu(x,y) = 1/q^2$ (with $q = 2$ for
binary CSPs). More precisely, $E Z_b(\nu) \doteq \exp\{n\Phi(\nu)\}$ with $\Phi$ achieving its unique maximum at $\bar\nu$.

This is also a crucial step in the second moment method. It was accomplished in [AN05] for proper q-colorings
of random graphs. In the case of binary CSPs, we prove this estimate in the Appendix.

(3) Prove that the above imply that the set of solutions of a random instance is, with high probability, roughly
spherical. By this we mean that the joint type $\nu_{12}$ of two uniformly random solutions $x^{(1)}, x^{(2)}$ satisfies $\|\nu_{12} - \bar\nu\|_{TV} \le \delta$
with high probability for all $\delta > 0$. Notice that this implication requires bounding the expected ratio of $Z_b(\nu)$ to the
total number of solution pairs. We prove that the implication nevertheless holds in Section 5 for q-colorings. The
argument for binary CSP's is completely analogous, and we omit it.

Finally, it was proved in [GM07] that, under such a sphericity condition, graph reconstruction and tree
reconstruction are equivalent, which finishes the proof of Theorems 3.3 and 3.6.

Notice that the techniques used for the clustering and reconstruction thresholds are very different. Thus it is a
surprising (and arguably deep) phenomenon that they do coincide as far as the present techniques can tell.

4 Tree ensembles and tree reconstruction for binary k-CSP ensembles

In this section we define tree ensembles and prove estimates about the corresponding tree reconstruction thresholds. 

4.1 The tCSP(a,p) ensemble 

The ensemble tCSP$(\alpha,p)$ is defined by $\alpha \in \mathbb{R}_+$ and a distribution $p$ over Boolean functions $\varphi : \{-1,+1\}^k \to \{0,1\}$.
We assume the conditions on the distribution $p$ introduced in Section 3.1. An (infinite) instance from this ensemble
is generated starting from a root variable node 0, drawing an integer $\eta \sim \mathrm{Poisson}(k\alpha)$ and connecting 0 to $\eta$ function
nodes $\{1, \dots, \eta\}$. Each function node has degree $k$, and each of its $k-1$ descendants is the root of an independent
infinite tree. Finally, each function node $a$ is associated, independently, with a random clause $\varphi_a$ drawn according to
$p$.

A uniform solution for such an instance is sampled by drawing the root value $x_0 \in \{+1,-1\}$ uniformly at random.
The values of the descendants of each variable node $i$ are then drawn recursively. If the function node $a$ connects $i$ to
$i_1, \dots, i_{k-1}$, then the values $x_{i_1}, \dots, x_{i_{k-1}}$ are sampled uniformly from those that satisfy the clause in $a$, that is, such
that the quantity $\varphi_a(x_i, x_{i_1}, \dots, x_{i_{k-1}})$ is equal to 1.

By the balance condition, this procedure can be shown to be equivalent to sampling a solution according to the
'free boundary Gibbs measure.' The latter is a distribution over solutions of the entire (infinite) tCSP formula defined
by considering the uniform distribution over solutions of the first $t$ generations of the tree, and then letting $t \to \infty$.

4.2 Reconstruction 

Given any fixed tree ensemble $T$, let $x$ be a random satisfying assignment for $T$ according to the distribution described
previously. We denote by $x_\ell$ the value of $x$ at the variables at generation $\ell$, and in the case that the root degree is
1, we denote by $x_{0,1}, \dots, x_{0,k-1}$ the values at the variable nodes connected to the unique child of the root. Also, we
use $\eta_0$ for the root degree of $T$. If the tree ensemble $T$ has root degree $\eta_0 = d$, we denote by $T_i$, $i = 1, \dots, d$, the
subtree generated by the root, its $i$-th child and its descendants. If $\eta_0 = 1$, we denote by $T'_i$, $i = 1, \dots, k-1$, the
subtree generated by the $i$-th child of the root's child and its descendants.

Finally, because the tree ensemble $T$ could be random (for instance, we denote by $T$ a random tCSP$(\alpha,p)$), we
will use $\mathbb{E}$ for expectation with respect to $T$, $\langle\cdot\rangle$ for expectation with respect to $x$ (given $T$), and $E$ for expectation with respect
to any other independent random variable (adding, if not clear from the context, a subindex to indicate such random variable).
Reconstruction: For a fixed tree ensemble $T$, let $\mu_{0,\ell}$ be the joint distribution of $(x_0, x_\ell)$ and let $\mu_0$, $\mu_\ell$ be
the marginal distributions of $x_0$ and $x_\ell$ respectively. The reconstruction rate for $T$ is defined as the quantity
$\|\mu_{0,\ell}(\cdot,\cdot) - \mu_0(\cdot)\mu_\ell(\cdot)\|_{TV}$. The reconstruction problem for $T$ is tree-solvable if

$$\liminf_{\ell\to\infty} \|\mu_{0,\ell}(\cdot,\cdot) - \mu_0(\cdot)\mu_\ell(\cdot)\|_{TV} > 0.$$

Analogously, if $T$ is a random tCSP$(\alpha,p)$, we define the reconstruction rate of $T$ as $\mathbb{E}\|\mu_{0,\ell}(\cdot,\cdot) - \mu_0(\cdot)\mu_\ell(\cdot)\|_{TV}$,
and we say that the reconstruction problem for $T$ is tree-solvable if

$$\liminf_{\ell\to\infty} \mathbb{E}\|\mu_{0,\ell}(\cdot,\cdot) - \mu_0(\cdot)\mu_\ell(\cdot)\|_{TV} > 0.$$

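Bias, compatibility: Given a satisfying assignment $x_\ell$ for the variables at generation $\ell$, define the 'bias' of the
root, restricted to the value of the variables at level $\ell$, as

$$h_T(x_\ell) \stackrel{\rm def}{=} \langle x_0 \mid x_\ell\rangle_T,$$

that is, the conditional expectation of the root value given that the variables at generation $\ell$ take the values $x_\ell$.
Throughout the next proofs we will study $h_T(x_\ell)$, for $x_\ell$ random and subject to different kinds of distributions. Notice
that under the balance condition $\|\mu_{0,\ell}(\cdot,\cdot) - \mu_0(\cdot)\mu_\ell(\cdot)\|_{TV}$ equals $\langle|h_T(x_\ell)|\rangle_T$ up to the normalization convention for the total
variation distance (the two coincide if the total variation distance is taken to be the full $\ell_1$ distance).

Now, let $D_T(x_\ell) \stackrel{\rm def}{=} \{x\}$ if $h_T(x_\ell) = x \in \{-1,1\}$, and $D_T(x_\ell) \stackrel{\rm def}{=} \{-1,1\}$ if $|h_T(x_\ell)| < 1$. Observe that $D_T(x_\ell)$ consists of the
values of the root that are compatible with the assignment $x_\ell$ for the variables at generation $\ell$.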
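Domain of clauses: Given a binary function $\varphi(x_0, \dots, x_{k-1})$, define the partial solution sets

$$S_+(\varphi) \stackrel{\rm def}{=} \{(x_1, \dots, x_{k-1}) : \varphi(1, x_1, \dots, x_{k-1}) = 1\},$$
$$S_-(\varphi) \stackrel{\rm def}{=} \{(x_1, \dots, x_{k-1}) : \varphi(-1, x_1, \dots, x_{k-1}) = 1\},$$
$$A_+(\varphi) \stackrel{\rm def}{=} S_+(\varphi) \setminus S_-(\varphi), \qquad A_-(\varphi) \stackrel{\rm def}{=} S_-(\varphi) \setminus S_+(\varphi).$$

If the clause $\varphi$ is balanced and feasible, we have that $|S_+(\varphi)| = |S_-(\varphi)| = 2^{k-1}\|\varphi\|^2$ and $|A_+(\varphi)| = |A_-(\varphi)| =
2^k I_1(\varphi)$.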
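The two counting identities above can be verified by enumeration for a concrete clause. The following sketch does so for the Not-All-Equal clause; the helper names are hypothetical, and the set difference is written `A_plus` following the glyph visible in the source.

```python
import itertools

def nae(x):
    """Not-All-Equal clause on a tuple of +1/-1 values (balanced and feasible)."""
    return int(len(set(x)) > 1)

def norm_sq(phi, k):
    """||phi||^2 under the uniform measure on {-1,+1}^k."""
    return sum(phi(x) ** 2 for x in itertools.product((-1, 1), repeat=k)) / 2 ** k

def influence_first(phi, k):
    """I_1(phi): squared norm of the discrete derivative in the first variable."""
    rest = itertools.product((-1, 1), repeat=k - 1)
    return sum((0.5 * (phi((1,) + x) - phi((-1,) + x))) ** 2 for x in rest) / 2 ** (k - 1)

def partial_solution_sets(phi, k):
    """Return (S_plus, S_minus, A_plus, A_minus) as sets of (k-1)-tuples."""
    rest = list(itertools.product((-1, 1), repeat=k - 1))
    s_plus = {x for x in rest if phi((1,) + x) == 1}
    s_minus = {x for x in rest if phi((-1,) + x) == 1}
    return s_plus, s_minus, s_plus - s_minus, s_minus - s_plus

k = 4
S_plus, S_minus, A_plus, A_minus = partial_solution_sets(nae, k)
```

For the NAE clause, `A_plus` contains the single partial assignment with all entries equal to -1: it is the only one that forces the first variable to be +1.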

Theorem 4.1 The reconstruction problem for the ensemble tCSP$(\alpha,p)$ is tree-solvable if and only if $\alpha > \alpha_{\rm tree}(k)$,
where

$$\alpha_{\rm tree}(k) = \frac{\Omega_k}{k}\,\{\log k + o(\log k)\}.$$

Proof. Upper bound:

Given a tree ensemble $T$, the rate of 'naive reconstruction' for $T$ is defined as

$$z_\ell(T) \stackrel{\rm def}{=} \langle\mathbb{I}[h_T(x_\ell) = 1]\rangle_T \quad \left(= \langle\mathbb{I}[h_T(x_\ell) = -1]\rangle_T \text{ by the balance condition}\right),$$

which indicates the probability that a random assignment for the variables at generation $\ell$, distributed as $x_\ell$, fixes the
root to be equal to 1 (or $-1$). It is easy to see that $\langle|h_T(x_\ell)|\rangle_T \ge 2 z_\ell(T)$. Observe also that, for any $x, y \in \{-1,1\}$,

$$\langle\mathbb{I}[h_T(x_\ell) = x] \mid x_0 = y\rangle_T = 2 z_\ell(T)\,\delta_{x,y}. \qquad (5)$$

Thus, our objective is to show that in an appropriate regime of the parameter $\alpha$, the quantity $\mathbb{E}[z_\ell(T)]$ remains
bounded away from zero as $\ell \to \infty$, implying tree-solvability of the reconstruction problem in such regime. Indeed,
this implies tree-solvability by 'naive reconstruction', i.e. by the procedure that assigns to the root any value
compatible with the values at generation $\ell$. For notational convenience, define

$$z_\ell(\alpha) = 2\,\mathbb{E}[z_\ell(T)] \quad\text{and}\quad \hat z_\ell(\alpha) = 2\,\mathbb{E}[z_\ell(T) \mid \eta_0 = 1].$$
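Now, notice that for a tree ensemble $T$ with root degree $\eta_0 = d$, and any assignment $x_\ell$ for the variables at generation
$\ell$, $h_T(x_\ell) = 1$ iff $h_{T_i}(x_\ell|_{T_i}) = 1$ for some $i = 1, \dots, d$, so that

$$2 z_\ell(T) = 1 - \Big\langle \prod_{i=1}^{d}\left(1 - \mathbb{I}[h_{T_i}(x_\ell|_{T_i}) = 1]\right) \,\Big|\, x_0 = 1\Big\rangle_T$$

$$= 1 - \prod_{i=1}^{d}\big\langle \left(1 - \mathbb{I}[h_{T_i}(x_\ell) = 1]\right) \,\big|\, x_0 = 1\big\rangle_{T_i} \quad \text{(by the tree Markov property)}$$

$$= 1 - \prod_{i=1}^{d}\left(1 - 2 z_\ell(T_i)\right) \quad \text{(by Eq. (5))}.$$

Therefore, averaging over $T$, and writing $\hat z_\ell(\alpha) = 2\,\mathbb{E}[z_\ell(T) \mid \eta_0 = 1]$ for the average over trees with root degree 1, we get

$$z_\ell(\alpha) = E_\eta\Big[1 - \prod_{i=1}^{\eta}\left(1 - \hat z_\ell(\alpha)\right)\Big] = 1 - \exp\left(-k\alpha\,\hat z_\ell(\alpha)\right), \qquad \eta \sim \mathrm{Poisson}(k\alpha).$$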
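The last equality is the generating-function identity for a Poisson variable: if $\eta$ has mean $\lambda$, then $E[(1-z)^\eta] = e^{-\lambda z}$. A short numerical check, summing the Poisson series directly (the function name and the truncation at 200 terms are arbitrary choices made for this sketch):

```python
import math

def poisson_average(lam, z, terms=200):
    """E[1 - (1 - z)^eta] for eta ~ Poisson(lam), by direct summation of the series."""
    total = 0.0
    pmf = math.exp(-lam)  # P(eta = 0)
    for d in range(terms):
        total += pmf * (1.0 - (1.0 - z) ** d)
        pmf *= lam / (d + 1)  # P(eta = d + 1) from P(eta = d)
    return total

lam, z = 3.0, 0.2
direct = poisson_average(lam, z)
closed_form = 1.0 - math.exp(-lam * z)
```

In the recursion above, lam plays the role of k times alpha, and z the role of the conditioned quantity for trees with root degree 1.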



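On the other hand, given a tree ensemble $T$ with root degree $\eta_0 = 1$ and with the clause $\varphi$ assigned to the root's
child, we have that for any satisfying assignment $x_\ell$ for the variables at generation $\ell$, $h_T(x_\ell) = 1$ iff

$$\left(D_{T'_1}\big(x^{(1)}_{\ell-1}\big) \times \cdots \times D_{T'_{k-1}}\big(x^{(k-1)}_{\ell-1}\big)\right) \cap S_-(\varphi) = \emptyset, \qquad (6)$$

where $x^{(i)}_{\ell-1}$ is the assignment $x_\ell|_{T'_i}$ for the variables at generation $\ell - 1$ in the subtree $T'_i$. Observe that (6)
holds, in particular, if for some $a = (a_1, \dots, a_{k-1}) \in A_+(\varphi)$, $h_{T'_i}\big(x^{(i)}_{\ell-1}\big) = a_i$ for $i = 1, \dots, k-1$. Therefore, if
$y = (y_1, \dots, y_{k-1})$ denotes a uniformly random vector from $S_+(\varphi)$, we have

$$2 z_\ell(T) \ge \Big\langle \sum_{a\in A_+(\varphi)} \prod_{i=1}^{k-1} \mathbb{I}\big[h_{T'_i}\big(x^{(i)}_{\ell-1}\big) = a_i\big] \,\Big|\, x_0 = 1\Big\rangle_T$$

$$= E_y \sum_{a\in A_+(\varphi)} \prod_{i=1}^{k-1} \big\langle \mathbb{I}[h_{T'_i}(x_{\ell-1}) = a_i] \,\big|\, x_0 = y_i\big\rangle_{T'_i} \quad \text{(by the tree Markov property)}$$

$$= \frac{1}{|S_+(\varphi)|} \sum_{a\in A_+(\varphi)} \prod_{i=1}^{k-1} 2 z_{\ell-1}(T'_i) \quad \text{(by Eq. (5))}$$

$$= \frac{|A_+(\varphi)|}{|S_+(\varphi)|} \prod_{i=1}^{k-1} 2 z_{\ell-1}(T'_i) = \frac{2 I_1(\varphi)}{\|\varphi\|^2} \prod_{i=1}^{k-1} 2 z_{\ell-1}(T'_i),$$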

which implies, after averaging over T, that 

Ze (a) > 



n2z^^i (i;') (By Eq. ®), 



2I1 (y^) 



(z£-i (a)) 



fe-i _ [ze-i (a)) 



k-l 



which leads to the recursion $z_\ell(\alpha) \ge 1 - \exp\big(-k\alpha\, (z_{\ell-1}(\alpha))^{k-1}/\Omega_k\big)$. Now, it is standard to verify that this
recursion implies that $z_\ell(\alpha)$ is, for all $\ell$, greater than or equal to the maximum of the fixed points of the function
$g(z) = 1 - \exp(-k\alpha z^{k-1}/\Omega_k)$ in the interval $[0, 1]$. The minimum value of $\alpha$ for which such a fixed point is positive
is given by

\[ \alpha_* = \frac{\Omega_k}{k(k-1)}\, (1 + u) \Big(1 + \frac{1}{u}\Big)^{k-2} , \]

where $u$ is the unique positive solution of the equation $u = (k-1)\log(1+u)$. In particular, asymptotically in $k$, we have
that $\alpha_* = \frac{\Omega_k}{k}(\log k + o(\log k))$, which implies the upper bound for $\alpha_{\rm tree}$.
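The threshold formula can be explored numerically. The sketch below solves $u = (k-1)\log(1+u)$ by fixed-point iteration, evaluates $\alpha_*/\Omega_k$, and iterates the map $z \mapsto 1 - \exp(-c z^{k-1})$ from $z = 1$ on either side of the threshold. The function names and the shorthand $c = k\alpha/\Omega_k$ are assumptions introduced for illustration, and the choice $k = 4$ is arbitrary.

```python
import math

def solve_u(k: int, iters: int = 500) -> float:
    """Positive root of u = (k - 1) * log(1 + u), assuming k >= 3, by fixed-point iteration."""
    u = float(k)
    for _ in range(iters):
        u = (k - 1) * math.log1p(u)
    return u

def alpha_star_over_omega(k: int) -> float:
    """Return alpha_* / Omega_k = (1 + u) * (1 + 1/u)**(k - 2) / (k * (k - 1))."""
    u = solve_u(k)
    return (1.0 + u) * (1.0 + 1.0 / u) ** (k - 2) / (k * (k - 1))

def iterate_recursion(c: float, k: int, steps: int = 5000) -> float:
    """Iterate z -> 1 - exp(-c * z**(k - 1)) from z = 1, with c standing for k * alpha / Omega_k."""
    z = 1.0
    for _ in range(steps):
        z = 1.0 - math.exp(-c * z ** (k - 1))
    return z

k = 4                                         # illustrative clause size (assumption)
c_star = k * alpha_star_over_omega(k)
print(iterate_recursion(1.05 * c_star, k))    # stays bounded away from zero
print(iterate_recursion(0.95 * c_star, k))    # collapses to (numerically) zero
```

Just above the threshold the iteration settles at a positive fixed point, while just below it the only fixed point in $[0,1]$ is zero, which is the dichotomy the recursion argument exploits.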
Lower bound: 
The matching lower bound on $\alpha_{\rm tree}(k)$ requires a more elaborate proof; we first prove three lemmas, before
returning to complete the lower bound proof. □

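Given a tree ensemble $T$, let $x_\ell^+ = (x_\ell \mid x_0 = 1)$ and $x_\ell^- = (x_\ell \mid x_0 = -1)$. When the tree ensemble is not clear in
the definition of $x_\ell^+$ (or $x_\ell^-$), we add a subindex indicating the tree ensemble from which it is defined. Notice that,
if $\mu^+$ and $\mu^-$ are the distributions of $x_\ell^+$ and $x_\ell^-$ respectively, then

\[ \frac{d\mu^-}{d\mu^+}(x_\ell) = \frac{1 - h_T(x_\ell)}{1 + h_T(x_\ell)} . \tag{7} \]

By the balance condition, it's clear that

\[ h_T(x_\ell^-) \stackrel{d}{=} -h_T(x_\ell^+) . \tag{8} \]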



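Also, it is easy to show that $\langle h_T(x_\ell^+) \rangle_T = \langle [h_T(x_\ell^+)]^2 \rangle_T$ (and therefore $[R_\ell(T)]^2 \le \langle h_T(x_\ell^+) \rangle_T \le R_\ell(T)$), so that
non-reconstructibility for $T$ is equivalent to the condition $\lim_{\ell\to\infty} \langle h_T(x_\ell^+) \rangle_T = 0$ (see [MP03]). Similarly, if $T$ is a
random tCSP$(\alpha, p)$ ensemble, non-reconstructibility for $T$ is equivalent to the condition $\lim_{\ell\to\infty} \mathbb{E}\, \langle h_T(x_\ell^+) \rangle_T = 0$.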

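Lemma 4.2 (a) Given a tree ensemble $T$ with root degree $\eta_0 = d$, we have

\[ \frac{1 - h_T(x_\ell^+)}{1 + h_T(x_\ell^+)} \stackrel{d}{=} \prod_{i=1}^{d} \frac{1 - h_{\ell,i}}{1 + h_{\ell,i}} , \tag{9} \]

where $(h_{\ell,i})_{i=1}^{d}$ are independent random variables such that $h_{\ell,i} \stackrel{d}{=} h_{T_i}(x_\ell^+)$.

(b) Given a tree ensemble $T$ with root degree $\eta_0 = 1$ and with the clause $\varphi$ assigned to the unique child of the
root, we have that

\[ \frac{1 - h_T(x_{\ell+1}^+)}{1 + h_T(x_{\ell+1}^+)} \stackrel{d}{=} \frac{T_{h_\ell}\varphi(-1, s)}{T_{h_\ell}\varphi(1, s)} , \tag{10} \]

where $s \sim \mathrm{Unif}(S_+(\varphi))$ and $h_\ell = (h_{\ell,i})_{i=1}^{k-1}$ are independent random variables such that $h_{\ell,i} \stackrel{d}{=} h_{T'_i}(x_\ell^+)$.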



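Proof. This recursion follows straightforwardly from the recursive definition of tree formulae. The balance condition
on clauses implies

\[ \frac{1 - h_T(x_\ell^+)}{1 + h_T(x_\ell^+)} = \frac{\langle \mathbb{I}[x_\ell = x_\ell^+] \mid x_0 = -1 \rangle_T}{\langle \mathbb{I}[x_\ell = x_\ell^+] \mid x_0 = 1 \rangle_T} . \]

Therefore, if the root degree of $T$ is $\eta_0 = d$, we have by the tree Markov property that

\[ \frac{1 - h_T(x_\ell^+)}{1 + h_T(x_\ell^+)} = \prod_{i=1}^{d} \frac{\langle \mathbb{I}[x_\ell = x_\ell^+ \upharpoonright T_i] \mid x_0 = -1 \rangle_{T_i}}{\langle \mathbb{I}[x_\ell = x_\ell^+ \upharpoonright T_i] \mid x_0 = 1 \rangle_{T_i}} , \]

and the last expression has the same distribution as $\prod_{i=1}^{d} \frac{1 - h_{\ell,i}}{1 + h_{\ell,i}}$, due to the fact that $(x_\ell^+ \upharpoonright T_i)_{i=1}^{d}$ are independent
random assignments for the variables at generation $\ell$ of $T_i$, such that $x_\ell^+ \upharpoonright T_i \stackrel{d}{=} x^+_{\ell, T_i}$. This proves Eq. (9). Now, if
the root degree of $T$ is $\eta_0 = 1$, define $(x^{(i)}_\ell)_{i=1}^{k-1}$ to be independent random assignments for the variables at generation $\ell$
of the subtrees $T'_i$, such that $x^{(i)}_\ell \stackrel{d}{=} x^+_{\ell, T'_i}$. By the tree Markov property, we have that $(x_{\ell+1}^+ \upharpoonright T'_i)_{i=1}^{k-1} \stackrel{d}{=} (s_i x^{(i)}_\ell)_{i=1}^{k-1}$,
where $s \sim \mathrm{Unif}(S_+(\varphi))$. Using once more the tree Markov property, we get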



hr (x; 



hx (x; 



which is precisely Eq. (10).



|xo = Vt 



y i=i 
T/.,y(-l,s) 



|xo = 



T' 



□ 



The first step of the above recursion can be analyzed exactly. 

Lemma 4.3 If $T$ is a random tCSP$(\alpha, p)$ ensemble, then the random variable $h_T(x_1^+)$ takes values in $\{0, 1\}$ and,
if $\alpha < (1 - \delta)(\Omega_k \log k)/k$, we have $\mathbb{E}\, h_T(x_1^+) < 1 - k^{-1+\delta}$.

Proof. If $T$ is a tree ensemble with root degree $\eta_0 = 1$ and clause $\varphi$ assigned to the root's child, from part (b) of
Lemma 4.2 we have that $\frac{1 - h_T(x_1^+)}{1 + h_T(x_1^+)} \stackrel{d}{=} \varphi(-1, s)$, where $s \sim \mathrm{Unif}(S_+(\varphi))$ (notice that $h_{0,i} = 1$). Therefore, it follows
that $h_T(x_1^+) = 1$ w.p. $1/\Omega_k$ and $h_T(x_1^+) = 0$ otherwise. Therefore, if $T$ is a tree ensemble with root
degree $\eta_0 = d$, it follows from part (a) of Lemma 4.2 that $h_T(x_1^+) = 1$ w.p. $1 - (1 - 1/\Omega_k)^d$ and $h_T(x_1^+) = 0$
otherwise. This then implies that $h_T(x_1^+)$ is supported on $\{0, 1\}$ and $\mathbb{E}\, h_T(x_1^+) = 1 - \exp(-k\alpha/\Omega_k)$. The
conclusion follows straightforwardly.

□ 



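For subsequent steps we track the averages $\bar h_\ell \stackrel{\rm def}{=} \mathbb{E}\, \langle h_T(x_\ell^+) \rangle_T$ and $\hat h_\ell \stackrel{\rm def}{=} \mathbb{E}\, [\langle h_T(x_\ell^+) \rangle_T \mid \eta_0 = 1]$, using
the following bounds.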



Lemma 4.4 For any £ >Q we have 
< 1 - e 

'(^(i),T, ^(D) 



ht+1 < 2 ^kihg 



m\ 



Rk{9) =^ 2E^i 



2Ii M 



<(IQI,2) 



QC[fc-l] 



(11) 

(12) 
(13) 



Finally, if hi is supported on non-negative values, then 

hr < Fk{hD . 

Proof. We will say that a random variable $X \in [-1, +1]$ is 'consistent' if $\mathbb{E} f(-X) = \mathbb{E}\big[\frac{1 - X}{1 + X} f(X)\big]$ for every
function $f$ such that the expectation values exist. A useful preliminary remark [MM06] is that the random variable
$h_T(x_\ell^+)$ is consistent (no matter the tree ensemble). In fact, this follows directly from Eqs. (7) and (8) above. A
number of properties of consistent random variables can be found in [RU08]. Let us now consider the first inequality.
If $T$ is a tree ensemble with root degree $\eta_0 = d$, it is immediate from Eq. (9) that

\[ \Big\langle \Big( \frac{1 - h_T(x_\ell^+)}{1 + h_T(x_\ell^+)} \Big)^{1/2} \Big\rangle_T = \prod_{i=1}^{d} \Big\langle \Big( \frac{1 - h_{T_i}(x_\ell^+)}{1 + h_{T_i}(x_\ell^+)} \Big)^{1/2} \Big\rangle_{T_i} . \]

It is possible to show that consistency implies $\mathbb{E} X = \mathbb{E} X^2$ and $\mathbb{E} \big(\frac{1-X}{1+X}\big)^{1/2} = \mathbb{E} \sqrt{1 - X^2}$ (through the test functions
$f(x) = x(1 + x)$ and $f(x) = x (1 + x)^{1/2} (1 - x)^{-1/2}$); we thus have

^l-{hr{^t))T>{^^-[hT{4)f 



x{Ni-[hT.{^t)Y) >n 1-(^T.(X/-)),^ 



> 
This implies in particular, if $T$ is a random tCSP$(\alpha, p)$,



n(l-E[(/.T(x+)), 1,0^1]) 



, rj ~ Poisson(fcQ;) 



from where the first inequality follows. 
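As a sanity check on the notion of consistency used in this proof, the sketch below builds a two-point random variable $X$ taking the value $a$ with probability $(1+a)/2$ and $-a$ with probability $(1-a)/2$, and verifies numerically the identity $\mathbb{E} f(-X) = \mathbb{E}[\frac{1-X}{1+X} f(X)]$ together with the consequences $\mathbb{E} X = \mathbb{E} X^2$ and $\mathbb{E}(\frac{1-X}{1+X})^{1/2} = \mathbb{E}\sqrt{1-X^2}$. The example distribution and all helper names are assumptions chosen for illustration; they do not come from the text.

```python
import math

def expect(dist, f):
    """Expectation of f(X) for a finite distribution given as (value, probability) pairs."""
    return sum(p * f(x) for x, p in dist)

def two_point(a: float):
    """X = a w.p. (1 + a)/2 and X = -a w.p. (1 - a)/2, for 0 < a < 1 (illustrative example)."""
    return [(a, (1 + a) / 2), (-a, (1 - a) / 2)]

def is_consistent(dist, f, tol=1e-12) -> bool:
    """Check E f(-X) == E[(1 - X)/(1 + X) * f(X)] for one test function f."""
    lhs = expect(dist, lambda x: f(-x))
    rhs = expect(dist, lambda x: (1 - x) / (1 + x) * f(x))
    return abs(lhs - rhs) < tol

X = two_point(0.6)
assert is_consistent(X, math.exp) and is_consistent(X, lambda x: x ** 3 + 2)
# Consequence 1: E X = E X^2.
assert abs(expect(X, lambda x: x) - expect(X, lambda x: x * x)) < 1e-12
# Consequence 2: E sqrt((1 - X)/(1 + X)) = E sqrt(1 - X^2).
assert abs(expect(X, lambda x: math.sqrt((1 - x) / (1 + x)))
           - expect(X, lambda x: math.sqrt(1 - x * x))) < 1e-12
```

Checking a couple of test functions does not prove consistency in general, but for this two-point law the identity can also be verified by hand, since the weight $\frac{1-X}{1+X}$ simply swaps the two probabilities.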

Now, from the recursion Eq. (10), we have for a tree ensemble $T$ with root degree $\eta_0 = 1$, and random clause $\varphi$
assigned to the child of the root,



hr (x+ 



i+l) 



2T„, ^(1) (S) def 

— — — , V S) = Vi^,s)Lp{~l,s) 

1 + Th, ijj (s) 



or alternatively. 



hr (x+ J = T;,, <^(i) (s) + (t^, <^(i) (s)) Gk ihi,s) , Gk {hi,s) 



def 



1 + T,,, ^ (s) 



where s ^ Unif 5"+ (1^). Notice that for any antisymmetric function / (s), we have that Eg / (s) = '^^p"^^- Therefore, 

due to the fact that T/j, tp^^^ (s) is antisymmetric and Gk ihi,s) is symmetric (both in s and hi, actually), we have 
the formulas 

2 //^(i) Tn.^^'Hs) 



and 



(/^T (X+ JV 



ii^ir 



(14) 



(15) 



In the last expression, the first term is equal to 
expansion, as 



IMF 



, while the second term can be written, using Fourier



M 



^2 E {'P^'\lQKhAlQihl)Gk{hi,-)]){^^'\lQ) 



QC[fc-l] 
IQI odd 



Using the fact that $\mathbb{E}|X| \le (\mathbb{E} X)^{1/2}$ for consistent random variables, we can bound the terms with $|Q| \ge 3$ by

1/2 



i[fe-i] \ieQ j 



QC[fe 
|Q|>3 odd 

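Also, using the fact that for any even function $f(x)$ with $0 \le f(x) \le 1$ and a consistent random variable $X$, we have

\[ |\mathbb{E}[X f(X)]| = \big|\mathbb{E}\big[2X^2 f(X)/(1 + X)\, \mathbb{I}_{\{X > 0\}}\big]\big| \le \big|\mathbb{E}\big[2X^2/(1 + X)\, \mathbb{I}_{\{X > 0\}}\big]\big| = |\mathbb{E}[X]| , \]

we can bound the terms with $|Q| = 1$ by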



k-l 



E(^'^^7w)|(/^T.(x+)),^ 



Therefore, for a random tCSP$(\alpha, p)$ with root degree $\eta_0 = 1$, we obtain after averaging



(^W,T;,.ve(^(l)) 

\m 



211 (y) j2 I 

QC[fe-l] 
|Q|>3 odd 



1^11 



max{|Q|,2} 
which is precisely the second inequality in the Lemma.

Now, suppose that $h_\ell$ is supported on non-negative values and let $A_s = \{h_\ell : T_{h_\ell}\varphi^{(1)}(s) > 0\}$. Notice that the
complement of $A_s$ is $-A_s$ (due to the antisymmetry of $T_{h_\ell}\varphi^{(1)}(s)$ with respect to $h_\ell$). Therefore, using the consistency
of the random variables $h_{\ell,i}$, from Eq. (14) we get



/h f + w ^ /f (1) ^ fa) T_,, (s) 



IIV'II 
2 



1 + T_,„ (s) 



l{-hi e As) 



< 



2 

2(^W,T<„)^^W (s)) 



I (hi e A) 



(1) (s) 
' 1 + Th, ^ (s) 



1 - hi, 



M 

Therefore, for a random tCSP$(\alpha, p)$ with root degree $\eta_0 = 1$, we obtain after averaging, that

(^(l),T,.ve^(l)) 



which corresponds to the last inequality of the lemma. 



M 



□ 



We now return to completing the proof of Theorem 4.1.
Proof of the lower bound in Theorem 4.1. If $\theta = 1$, $T_1$ is the identity operator, whence $(\varphi^{(1)}, T_1 \varphi^{(1)}) = I_1(\varphi)$.
We have therefore $F_k(1) = 1/\Omega_k$. Now, expanding in Fourier series we get,

I(v>7q)I^ 

QC[k],Q3{t} 



QC[/c-l] 



By the Fourier expansion condition,

\[ F_k(\theta) \le e^{-Ck(1-\theta)}/\Omega_k . \tag{16} \]

Now fix $\alpha = (1 - \delta)(\Omega_k \log k)/k$, whence, by Lemma 4.3, $\bar h_1 \le 1 - k^{-1+\delta}$, and $h_1$ is supported on non-negative
reals. Using Eq. (13), we get $\hat h_2 \le e^{-Ck^{\delta}}/\Omega_k$, and therefore,



/i^^ < 1 - exp{-2(l - S)e-'-'' logfc} < 



On the other hand, from Eq. (3), we obtain the following bounds for $F_k(\theta)$, $R_k(\theta)$:



Fk[e) < 2E^ 
On the other hand. 



Ml' 



Rk{9) < 2E^i 



2Ii (V) 



k-l 



2Ii (y>) I / (1) \ 



< (Ae-'^*^/26'2 + r6'^)/l]fc, 



Therefore, for all £ we have 

hT+i < 1 - e-'="[^''(''")+^'=(''^'')l < (1 - 5)logfc(2Ae~^'^/2/ir + 2fc"(/if )3/2) , 
which implies ^ if, for some £ > 0, hf^ < k~^°- , thus finishing the proof. 



□ 
5 Reconstruction on Trees to Graphs: the case of proper q-colorings

In this section we prove that the set of solutions of the proper $q$-coloring ensemble satisfies the sphericity condition
described in the section [?75]

Given two assignments $x^{(1)}, x^{(2)}$ of the variables $x_1, \dots, x_n$, their joint type $v_{x^{(1)},x^{(2)}}$ is the $q \times q$ matrix with
$v_{x^{(1)},x^{(2)}}(i, j) = \frac{1}{n} |\{t \in G : x^{(1)}(t) = i \text{ and } x^{(2)}(t) = j\}|$. We consider random assignments $x^{(1)}, x^{(2)}$ taken
uniformly and independently over all the satisfying assignments of a random instance of the $q$-coloring model with
edge-variable density $\alpha$. Our purpose is to prove that for all $\delta > 0$, $\|v_{x^{(1)},x^{(2)}} - \bar v\|_{TV} \le \delta$ w.h.p., where $\bar v$ is the
matrix with all entries equal to $1/q^2$.
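To make the definition of the joint type concrete, here is a minimal sketch that computes it for two colorings given as lists of colors and measures the distance to the uniform matrix. Colors indexed by `0, ..., q-1`, the helper names, and the convention that total variation is half the $\ell_1$ distance are all assumptions made for this illustration.

```python
from collections import Counter

def joint_type(x1, x2, q):
    """q x q matrix v with v[i][j] = (1/n) * #{t : x1[t] == i and x2[t] == j}."""
    n = len(x1)
    counts = Counter(zip(x1, x2))
    return [[counts[(i, j)] / n for j in range(q)] for i in range(q)]

def tv_distance_to_uniform(v):
    """Half the l1 distance between v and the matrix with all entries 1/q^2 (assumed convention)."""
    q = len(v)
    return 0.5 * sum(abs(v[i][j] - 1.0 / q ** 2) for i in range(q) for j in range(q))

# Two "uncorrelated" colorings of four vertices with q = 2 colors.
v = joint_type([0, 0, 1, 1], [0, 1, 0, 1], q=2)
assert v == [[0.25, 0.25], [0.25, 0.25]]
assert tv_distance_to_uniform(v) == 0.0
```

Identical colorings sit at the other extreme: their joint type is supported on the diagonal and is far from the uniform matrix.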

Our argument makes crucial use of the following estimate for the partition function from [AC08].

Lemma 5.1 ([AC08, Lemma 7]) Let $Z$ be the number of satisfying assignments of a random instance of the
$q$-coloring model with edge-variable density $\alpha < q \log q$, then



1 



q 



„(9-l)/2 

and, for some function $f(n)$ of order $o(n)$, we have $\mathrm{Prob}(Z < e^{-f(n)}\, \mathbb{E}[Z]) \to 0$ as $n \to \infty$.

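Let us introduce some notation. If $w$ is a vector of length $q$ and $v$ is a $q \times q$ matrix, let $\mathcal{H}$ and $\mathcal{E}$ denote their
entropy and their energy respectively, where

\[ \mathcal{H}(v) = -\sum_{i,j} v(i,j) \log v(i,j) , \qquad \mathcal{H}(w) = -\sum_{i} w(i) \log w(i) , \]

\[ \mathcal{E}(v) = \log\Big( 1 - \sum_i \Big( \sum_j v(i,j) \Big)^2 - \sum_j \Big( \sum_i v(i,j) \Big)^2 + \sum_{i,j} v(i,j)^2 \Big) , \qquad \mathcal{E}(w) = \log\Big( 1 - \sum_i w(i)^2 \Big) . \]

Let $B_\epsilon$ consist of all the $q$-vectors $w$ with nonnegative entries such that $\sum_i w(i) = 1$ and $\|w - \bar w\|^2 \ge \epsilon$. Similarly, let
$B^{\epsilon,\delta}_{q \times q}$ be the set of all the $q \times q$ matrices with nonnegative entries such that $\|(v - \bar v) 1\|^2 \le \delta$, $\|1^* (v - \bar v)\|^2 \le \delta$ and
$\|v - \bar v\|^2 \ge \epsilon$.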

Our goal in this section is to prove the following theorem. 



Theorem 5.2 Let $x^{(1)}, x^{(2)}$ be random assignments taken uniformly and independently over all the satisfying
assignments of a random instance of the $q$-coloring model with edge-variable density $\alpha$. If $\alpha < (q - 1)\log(q - 1)$, then
for any $\epsilon > 0$,

\[ \mathrm{Prob}\big( \|v_{x^{(1)},x^{(2)}} - \bar v\|^2 \ge \epsilon \big) \to 0 \quad \text{as } n \to \infty . \]



We will present several lemmas before returning to the proof of the Theorem. First we introduce estimates
concerning an additive functional depending on the energy and entropy of a vector of length $q$.

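Lemma 5.3 If $w \in B_\epsilon$, then $\mathcal{H}(w) + \alpha \mathcal{E}(w) \le [\mathcal{H}(\bar w) + \alpha \mathcal{E}(\bar w)] - \frac{\alpha\epsilon}{2(1 - 1/q)}$.

Proof. Notice that $[\mathcal{H}(\bar w) + \alpha \mathcal{E}(\bar w)] - [\mathcal{H}(w) + \alpha \mathcal{E}(w)] \ge \alpha \log \frac{1 - 1/q}{1 - \|w\|^2}$. This quantity is bounded below by
$\alpha \log\big(1 + \frac{\epsilon}{1 - 1/q}\big)$, and therefore by $\frac{\alpha\epsilon}{2(1 - 1/q)}$. □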
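The inequality of Lemma 5.3 can be probed numerically. The sketch below samples points $w$ of the simplex, takes $\epsilon = \|w - \bar w\|^2$ (the largest $\epsilon$ for which $w$ qualifies), and checks that the gap $[\mathcal{H}(\bar w) + \alpha\mathcal{E}(\bar w)] - [\mathcal{H}(w) + \alpha\mathcal{E}(w)] - \frac{\alpha\epsilon}{2(1-1/q)}$ is non-negative. The function names, the sampling scheme, and the values `q = 4`, `alpha = 3.0` are assumptions for illustration only.

```python
import math
import random

def entropy(w):
    """H(w) = -sum_i w_i log w_i, with the convention 0 log 0 = 0."""
    return -sum(x * math.log(x) for x in w if x > 0)

def energy(w):
    """E(w) = log(1 - sum_i w_i^2)."""
    return math.log(1.0 - sum(x * x for x in w))

def lemma_5_3_gap(w, alpha):
    """[H(wbar) + alpha E(wbar)] - [H(w) + alpha E(w)] - alpha*eps/(2(1 - 1/q)), eps = ||w - wbar||^2."""
    q = len(w)
    wbar = [1.0 / q] * q
    eps = sum((x - 1.0 / q) ** 2 for x in w)
    lhs = entropy(wbar) + alpha * energy(wbar) - entropy(w) - alpha * energy(w)
    return lhs - alpha * eps / (2.0 * (1.0 - 1.0 / q))

random.seed(0)
q, alpha = 4, 3.0                       # illustrative values (assumption)
for _ in range(1000):
    raw = [random.random() for _ in range(q)]
    w = [x / sum(raw) for x in raw]     # a random point of the simplex (not uniformly distributed)
    assert lemma_5_3_gap(w, alpha) >= -1e-12
```

A random search of this kind cannot replace the proof, but it is a quick way to catch a misplaced constant in the bound.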



Lemma 5.4 Let $x$ be a random assignment of the variables taken uniformly over all the satisfying assignments of a
random instance of the $q$-coloring model with edge-variable density $\alpha < q \log q$. Then, for any $\epsilon > 0$,

\[ \mathrm{Prob}\big( \|w_x - \bar w\|^2 \ge \epsilon \big) \to 0 \quad \text{as } n \to \infty , \]

where $w_x$ is the vector with $q$ entries such that $w_x(i) = \frac{1}{n} |\{u \in G : x_u = i\}|$ and $\bar w$ is the vector with all entries equal
to $1/q$.
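Proof. Given a property $P$, denote by $Z(P)$ the number of satisfying assignments for which $P$ holds. Choose $\xi$ such
that $\xi < \frac{\alpha\epsilon}{2(1 - 1/q)}$. We have that

\[ \mathrm{Prob}\big( \|w_x - \bar w\|^2 \ge \epsilon \big) = \mathbb{E}\Big[ \frac{Z\big(\|w_x\|^2 \ge \epsilon + 1/q\big)}{Z} \Big] , \]

an expression that we can bound by

\[ \frac{\mathbb{E}\big[ Z\big(\|w_x\|^2 \ge \epsilon + 1/q\big) \big]}{e^{-n\xi}\, \mathbb{E}[Z]} + \mathrm{Prob}\big( Z < e^{-n\xi}\, \mathbb{E}[Z] \big) . \]

Now, according to Lemma 5.1, $\mathrm{Prob}(Z < e^{-n\xi}\, \mathbb{E}[Z]) \to 0$, and therefore it is enough to show that the term
$\mathbb{E}\big[ Z\big(\|w_x\|^2 \ge \epsilon + 1/q\big) \big] / \big(e^{-n\xi}\, \mathbb{E}[Z]\big)$ vanishes.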



Denote by $G_\epsilon$ the set of all vectors $\ell$, with nonnegative integer entries, such that $\sum_i (\ell_i/n) = 1$
and $\sum_i (\ell_i/n)^2 \ge \epsilon + 1/q$, and denote by $\Omega_w$ the set of assignments $x$ such that $w_x$ is equal to the vector $w$. Now,
E 



Z ( ||it;x|| > + 1/9 ) = X2;eo^/„ Prob is a satisfying assignment) 



i=l 



(17) 



n — 1 



1 - E (^./")' 



< X Sg^-J^^exp (n [7^ (^/n) + c„£ [l/n)]) 
fee, 

< 3q2«\/^ |^;,| sup {exp [n [H {l/n) + c„£ (^/n)])} . 

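Here $|G_\epsilon|$ is the number of elements of $G_\epsilon$, which is bounded by $n^q$. Notice also that if $\ell \in G_\epsilon$, then $\ell/n \in B_\epsilon$, so that
by Lemma 5.3,

\[ \mathcal{H}(\ell/n) + \alpha \mathcal{E}(\ell/n) \le [\mathcal{H}(\bar w) + \alpha \mathcal{E}(\bar w)] - \frac{\alpha\epsilon}{2(1 - 1/q)} = \log q + \alpha \log(1 - 1/q) - \frac{\alpha\epsilon}{2(1 - 1/q)} . \tag{18} \]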



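On the other hand, by Lemma 5.1 there is some constant $C$ such that

\[ e^{-n\xi}\, \mathbb{E}[Z] \ge \frac{C}{n^{(q-1)/2}} \Big[ q \Big(1 - \frac{1}{q}\Big)^{\alpha} \Big]^n e^{-n\xi} . \tag{19} \]

Combining Eqs. (17), (18) and (19), we have that for a polynomial $p(n)$ of degree $3q/2$,

\[ \frac{\mathbb{E}\big[ Z\big(\|w_x\|^2 \ge \epsilon + 1/q\big) \big]}{e^{-n\xi}\, \mathbb{E}[Z]} \le p(n) \exp\Big( n \Big[ \xi - \frac{\alpha\epsilon}{2(1 - 1/q)} \Big] \Big) . \tag{20} \]

From (20), it is now clear that

\[ \frac{\mathbb{E}\big[ Z\big(\|w_x\|^2 \ge \epsilon + 1/q\big) \big]}{e^{-n\xi}\, \mathbb{E}[Z]} \to 0 \]

as $n \to \infty$, due to the fact that $\xi - \frac{\alpha\epsilon}{2(1 - 1/q)} < 0$. □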



Next, our objective is to work with the quantity $\kappa^{\epsilon,\delta}_q$, which we define as the upper limit of the interval (indeed, it is
easy to see that this is an interval) consisting of the values $c$ such that

\[ \sup_{v \in B^{\epsilon,\delta}_{q \times q}} \mathcal{H}(v) + c\, \mathcal{E}(v) \le \mathcal{H}(\bar v) + c\, \mathcal{E}(\bar v) . \]

To motivate this, let us recall that an important part of the second moment argument of Achlioptas and Naor [AN05,
Theorem 7] (in showing that the chromatic number $\chi[G(n, d/n)]$ is concentrated on two possible values) relied on an
optimization of the expression $\mathcal{H}(v) + \alpha \mathcal{E}(v)$ over the Birkhoff polytope $B_{q \times q}$ of the $q \times q$ doubly stochastic matrices.
In particular, they proved that, as long as $\alpha < (q - 1)\log(q - 1)$, one has

\[ \sup_{v \in B_{q \times q}} \mathcal{H}(v) + \alpha \mathcal{E}(v) = \mathcal{H}(\bar v) + \alpha \mathcal{E}(\bar v) . \tag{21} \]



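Since $B^{\epsilon,0}_{q \times q} \subseteq B_{q \times q}$, we have $\kappa^{\epsilon,0}_q \ge (q - 1)\log(q - 1)$. The next lemma says that $\sup \mathcal{H}(v) + \alpha \mathcal{E}(v)$ is in fact
'separated' from $\mathcal{H}(\bar v) + \alpha \mathcal{E}(\bar v)$, provided that $\alpha < \kappa^{\epsilon,\delta}_q$.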



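Lemma 5.5 Suppose that $v \in B^{\epsilon,\delta}_{q \times q}$ where $\epsilon > 2\delta$, then, if $\alpha < \kappa^{\epsilon,\delta}_q$, we have that

\[ [\mathcal{H}(v) + \alpha \mathcal{E}(v)] \le [\mathcal{H}(\bar v) + \alpha \mathcal{E}(\bar v)] - \frac{\kappa^{\epsilon,\delta}_q - \alpha}{2(1 - 1/q)^2}\, [\epsilon - 2\delta] . \]

Proof. Indeed,

\begin{align*}
[\mathcal{H}(\bar v) + \alpha \mathcal{E}(\bar v)] - [\mathcal{H}(v) + \alpha \mathcal{E}(v)]
&= [\mathcal{H}(\bar v) + \kappa^{\epsilon,\delta}_q \mathcal{E}(\bar v)] - [\mathcal{H}(v) + \kappa^{\epsilon,\delta}_q \mathcal{E}(v)] + (\kappa^{\epsilon,\delta}_q - \alpha)\, (\mathcal{E}(v) - \mathcal{E}(\bar v)) \\
&\ge (\kappa^{\epsilon,\delta}_q - \alpha) \log\Big( 1 + \frac{1}{(1 - 1/q)^2} \big[ \|v - \bar v\|^2 - \|(v - \bar v) 1\|^2 - \|1^* (v - \bar v)\|^2 \big] \Big) \\
&\ge \frac{\kappa^{\epsilon,\delta}_q - \alpha}{2(1 - 1/q)^2}\, [\epsilon - 2\delta] .
\end{align*}

□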



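Lemma 5.6 Given $\epsilon > 0$ and $\alpha < (q - 1)\log(q - 1)$, there exists $\delta > 0$ such that $\kappa^{\epsilon,\delta}_q > \alpha$.

Proof. Assume the contrary; then there exists a sequence $\delta_n \downarrow 0$ such that $\kappa^{\epsilon,\delta_n}_q \le \alpha$ for each $n$. Due to the
continuity of $\exp(\mathcal{H}(v) + \alpha \mathcal{E}(v))$ in the compact set $B^{\epsilon,\delta_n}_{q \times q}$, the supremum of $\exp(\mathcal{H}(v) + \alpha \mathcal{E}(v))$ is reached at a
matrix $v_{\delta_n} \in B^{\epsilon,\delta_n}_{q \times q} \subseteq P_{q \times q}$, and due to the compactness of $P_{q \times q}$, a subsequence $[v_{\delta_{n_k}}]_{k \ge 1}$ of these matrices
converges in $P_{q \times q}$ to a matrix $v \in B^{\epsilon,0}_{q \times q}$. Therefore

\[ \mathcal{H}(v) + \alpha \mathcal{E}(v) \le \mathcal{H}(\bar v) + \alpha \mathcal{E}(\bar v) - \frac{\kappa^{\epsilon,0}_q - \alpha}{2(1 - 1/q)^2}\, \epsilon . \]

On the other hand,

\[ \mathcal{H}(v) + \alpha \mathcal{E}(v) \ge \liminf_{k \to \infty} \mathcal{H}(v_{\delta_{n_k}}) + \alpha \mathcal{E}(v_{\delta_{n_k}}) \ge \mathcal{H}(\bar v) + \alpha \mathcal{E}(\bar v) , \]

obtaining a contradiction.

□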



Proof of Theorem 5.2. Given a property $P$, denote by $Z^{(2)}(P)$ the number of pairs of satisfying assignments for
which $P$ holds. Take $\alpha'$ such that $\alpha < \alpha' < (q - 1)\log(q - 1)$ and use Lemma 5.6 to choose $\delta$ such that $\kappa^{\epsilon,\delta}_q > \alpha'$,

guaranteeing also that 26 < e. Now, let ^ be a positive real such that 2^ < ^.^^"^^^ °L [e — 26]. We have that 



Prob I 



E 



^(2) 



" /e-2"«E[Z]', 



which is bounded by the addition of the terms E Z*-^-* ^Ux(i).x(2)G ^qxqj 

Prob (Z < e~"?E [Z]), Prob (^|| (wx(i),x(2) - l|f > and Prob ^||l* (wx(i),x(2) " > • ^O"^' Lemma EH] im- 
plies that the second term vanishes and lemma 15.41 implies that the last two terms go to zero. Therefore, to show 

"^('^ka),x(2)eS'xJl/e^'"«E[Z]2 



that Pi'ob ^||t;x(i),x' 
vanishes. 



(2) — v\\ > e 



is sufficient to prove that the term E 






Denoting by $G_{\epsilon,\delta}$ the set of all $q \times q$ matrices $L$, with nonnegative integer entries, such that $L/n \in B^{\epsilon,\delta}_{q \times q}$, and
denoting by $\Omega_v$ the set of pairs of colorings $x_1, x_2$ such that $v_{x_1,x_2}$ is equal to the matrix $v$, we have



E 



'Ylix x'^en Prob {xi and a;2 are satisfying assignments) 



n — 1 



< iq^'^Vnexp{n[niL/n) + aE{L/n)]). 



And now, because $\kappa^{\epsilon,\delta}_q > \alpha' > \alpha$ and $L/n \in B^{\epsilon,\delta}_{q \times q}$ where $2\delta < \epsilon$, we can invoke Lemma 5.5 to get that



[nL/n) + aE{L/n)] < [H{v) + aE{v)] - ^ [e - 25] . 

2(1 - l/q) 



Therefore, 



E 



(t;x(„,x,.) e Sg^'^J] < 3g29V^|g,^,| _ l/q)"]^"exp 



where $|G_{\epsilon,\delta}|$ is the number of elements in $G_{\epsilon,\delta}$, which is bounded by $n^{q^2}$. On the other hand, by Lemma 5.1 we have
that for some constant $C$,



-2n{ 



> 



c 



7,(9-1) 



q 1-- 

q 



2n 



Hence, for a polynomial p (n) of degree q^ + q — 1, we have 



E 



,5,e 



e-2<E [Z]^ 

Due to the fact that 2^ < "^^^^ [e — 26] , it is now clear that 



< p{n) exp i n ( 2^ - ^['^ [e - 26] 



2(l-l/gr 



as $n \to \infty$.



□ 
A Constrained partition function for binary CSP's



In this section, we prove Proposition 3.1. Given a random CSP$(n, p, \alpha)$ ensemble $\{\varphi_a\}_{a=1}^{\alpha n}$, consider the statistic
$L_n(\varphi) = \frac{1}{\alpha n} |\{a : \varphi_a = \varphi\}|$, and denote by CSP$(n, p, \alpha; p_n)$ the ensemble $\{\varphi_a\}_{a=1}^{\alpha n}$ conditioned on $L_n = p_n$. Also,
denote by $\widetilde{\rm CSP}(n, p, \alpha)$ the ensemble $\{\varphi_a\}_{a=1}^{\alpha n}$ conditioned on $\|L_n - p\|_{TV} \le 1/n^{1/2-\gamma}$, where $\gamma$ is a fixed positive
constant. Because $\mathrm{Prob}(\|L_n - p\|_{TV} \ge 1/n^{1/2-\gamma})$ goes to zero (by the central limit theorem), the probability measures
induced by CSP$(n, p, \alpha)$ and $\widetilde{\rm CSP}(n, p, \alpha)$ become equivalent as $n \to \infty$.

A binary configuration $x$ is said to be balanced if $|x \cdot 1| \le 1$. We will use $Z$ and $Z_b$ to denote the variables that
count the number of satisfying assignments and balanced satisfying assignments, respectively, of a random CSP
ensemble. Given two binary assignments $x^{(1)}, x^{(2)}$, we define their overlap as $Q_{12} = x^{(1)} \cdot x^{(2)}/n = \sum_{i=1}^{n} x^{(1)}_i x^{(2)}_i / n$.
In other words, $(1 - Q_{12})/2$ is the normalized Hamming distance of $x^{(1)}$ and $x^{(2)}$.

The upper bound in Proposition 3.1 follows from a first moment calculation. In fact, for a random $\widetilde{\rm CSP}(n, p, \alpha)$,
we have



Me 



Prob (Z = 0) < E [Z] = Prob (a; is a satisfying assignment) = Y[ 

< exp (^n I log 2 + aJ2p i^) log ||<pf + O (l/n^/^-T^ 

and the last quantity goes to zero whenever $\alpha > (1 + \epsilon)\, \Omega_k \log 2$.

To establish the corresponding lower bound, we use the second moment method, but first we need two lemmas. 



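Lemma A.1 Given a random CSP$(n, p, \alpha; p_n)$ ensemble, let $Z_b(|Q_{12}| \ge \delta)$ be the number of balanced solution pairs
$x^{(1)}, x^{(2)} \in \{+1, -1\}^n$ with overlap larger than $\delta$. Then,

\[ \frac{\mathbb{E}[Z_b(|Q_{12}| \ge \delta)]}{(\mathbb{E} Z_b)^2} \le n \exp\Big\{ n \sup_{\theta \ge \delta} \Phi(\theta) \Big\} , \]

where

\[ \Phi(\theta) \stackrel{\rm def}{=} H(\theta) + \alpha\, \mathbb{E}_{\varphi \sim p_n} \log \frac{(\varphi, T_\theta \varphi)}{\|\varphi\|^4} \]

and $H(\theta) = -\frac{1+\theta}{2} \log(1 + \theta) - \frac{1-\theta}{2} \log(1 - \theta)$.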
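Two elementary properties of the function $H(\theta)$ are used repeatedly in what follows: $H(0) = 0$, $H(1) = -\log 2$, and $H'(\theta) = -\operatorname{atanh}\theta$. The sketch below checks them numerically with a central finite difference; the helper names and the step size are assumptions made for this illustration.

```python
import math

def H(theta: float) -> float:
    """H(theta) = -(1+theta)/2 * log(1+theta) - (1-theta)/2 * log(1-theta), with 0 log 0 = 0."""
    total = 0.0
    for s in (1.0 + theta, 1.0 - theta):
        if s > 0.0:
            total -= 0.5 * s * math.log(s)
    return total

def dH(theta: float, step: float = 1e-6) -> float:
    """Central finite-difference approximation of H'(theta)."""
    return (H(theta + step) - H(theta - step)) / (2.0 * step)

assert H(0.0) == 0.0
assert abs(H(1.0) + math.log(2.0)) < 1e-12
for theta in (0.05, 0.3, 0.6, 0.9):
    # H'(theta) should equal -atanh(theta) on (0, 1).
    assert abs(dH(theta) + math.atanh(theta)) < 1e-6
```

Since $\operatorname{atanh}\theta \ge \theta$ on $[0, 1)$, this also makes visible why the entropy term contributes at least $-\theta$ to the derivative of $\Phi$.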



Proof. For simplicity take $n$ to be even. Let $\varphi$ be a boolean function, and let $\pi : [k] \to [n]$ be a uniform random
assignment of the variables in $\varphi$. Now, given two balanced vectors $x^{(1)}, x^{(2)} \in \{-1, 1\}^n$, we have



where 9 — Qi2- Therefore, for some constant C > 0, 

9>SQi2=e 



< 



1 + 9 1 



1-9 1- 



4- aY^Ln i'p) log {(fi, Te ip) 



where Ti (•) is the entropy function. On the other hand, for some positive C", 



X balanced ^ 



C 



1 1 



> 7a72 ( <j w ( 2 ' 2 ) + "E^" if) log y 
It is straightforward now to check that



C" 



exp {n (6*)]) 



(22) 



and therefore ^ ^^i'g\"J-'^^ < nexp 



sup $ (6*) 



e><5 



□ 



Lemma A. 2 Given a random CSP(n,p,a;p) ensemble, if a < (1 — e)flk n log2, where tt^— E„^p ^iri^; then 
for any $\delta > 0$ there exists $C(\delta, \epsilon) > 0$ such that

\[ \mathbb{E}[Z_b(|Q_{12}| \ge \delta)] \le e^{-n C(\delta, \epsilon)} (\mathbb{E} Z_b)^2 . \]

Moreover, as $\delta \to 0$, $C(\delta, \epsilon) = \Omega(\delta^2)$.

Proof. In view of the previous lemma, it is sufficient to prove that the function $\theta \mapsto \Phi(\theta)$ achieves its maximum
over the interval $[0, 1]$ uniquely at $\theta = 0$. To establish the second statement, it will then be enough to prove that
$-\Phi(\theta) = \Omega(\theta^2)$ as $\theta \to 0$.

Fix a < (1 — e)flk log 2 < (1 — e)ilk log 2. We will prove the thesis claim by considering three different regimes 
for 6': < 9 < e^"''', e'^"-^ < 9 <\ — e^/^ and 1 — e^/^ < ^ < 1, where a is a small constant. In the first two intervals 
we will prove that the derivative of $(6') with respect to 9 is strictly negative. Recalling that > 1/2, we have 

— < -atanh^ + fcaE^ ' V ' 
d9 ||^||4 

<-9 + 2kaE, 9 + 2fcaE,Ml|! ^3 

\w\\ 

<-9 + Ae-^''-^9 + 2k-^9^ < -^9 + 4:k9^ , 

where we used (from Eq. (2)) the hypothesis on low weight Fourier coefficients. The last expression is strictly negative
if $0 < \theta \le e^{-ak}$ for any $a > 0$ and all $k$ large enough. The previous formula also shows $-\Phi(\theta) = \Omega(\theta^2)$ as $\theta \to 0$.
Next assume $e^{-ak} \le \theta \le 1 - \epsilon$. Using the hypothesis $(\varphi^{(1)}, T_\theta \varphi^{(1)}) \le e^{-Ck(1-\theta)} \|\varphi\|^2$, we have

— < -atanh6> + 4fcQ;E^ e 
d9 ^ ||^||4 



< -atanhe* + 2/c-^e"^'=^ < -atanhg + 2 (log 2) ke'^''^ 



which is strictly negative if > e "'^ with, say, a — (Ce^)/2. Finally, we notice that, for I ~ < 9 < 1, any e small 
enough we have H{9) < — log2 + e/10. Further, using the fact that {(p, Tg Lp) = \\ T01/2 ipW^ is non-decreasing in 9 

m < - log 2 + ^ - aE^ log 1 12 ^ - log 2 + ^ + J < -ei^ , 

which finishes the proof. □ 

Conclusion of Proof of Proposition 3.1. From the previous lemma we have that for any fixed $\delta > 0$,



EZ2 E 



< 



^b(|Qi2| < i5/nf') 



(EZb)' (EZb)' 
while a calculation analogous to that in Eq. (22) and the fact that $-\Phi(\theta) = \Omega(\theta^2)$, implies that



E 



'^b(|Qi2| < iS/nf^') 



C" 



(EZb) 



Now, letting $\delta \to 0$, it is clear that



9<(5/ri)i/2 



(EZb 



tends to 1. This proves, by means of the Paley-Zygmund inequality, that for
$\alpha < (1 - \epsilon) (\liminf \Omega_{k,p_n}) \log 2$, a CSP$(n, p, \alpha; p_n)$ ensemble is satisfiable w.h.p. The result extends straightforwardly
to a random CSP$(n, p, \alpha)$, after noticing that $\Omega_{k,L_n} \ge (1 - \epsilon)\, \Omega_{k,p}$ with high probability. □